A NEW METHOD FOR FAST COMPUTING UNBIASED 
ESTIMATORS OF CUMULANTS 
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Abstract. We propose new algorithms for generating fc-statistics, multivari- 
ate fc-statistics, polykays and multivariate polykays. The resulting computa- 
tional times are very fast compared with procedures existing in the literature. 
Such speeding up is obtained by means of a symbolic method arising from the 
classical umbral calculus. The classical umbral calculus is a light syntax that 
involves only elementary rules to managing sequences of numbers or polyno- 
mials. The cornerstone of the procedures here introduced is the connection 
between cumulants of a random variable and a suitable compound Poisson 
random variable. Such a connection holds also for multivariate random vari- 
ables. 



1. Introduction 

In the last decades, symbolic m ethods h ave been successfully used in differ- 
ent mathematical areas ( Grossmarj . Il989: W ang and Zh eng. 2005). Symbolic tech- 
niqi ies have been recently used i n problems arising from comput ational statistics 
(see 



Andrews and Stafford! (l2 000').'Ze ilbergeiJ(|2004[) ). The papers (JDi Nardo and Senatol 



2006bD and (|Di Nardo et al.l . 2008b) lie within this field. In these papers, the the- 
ory of fc-statistics and polykays was completely rewritten, carrying out a unifying 
framework for these estimators, both i n the univariate and multivariate cases. 

This subject goes back to Fisher (|l929h and up to today it was tre a ted by 
means of different languages. Main references are IStuart and OrdI (|l987l ). ISpeed 
( 1983 , 19861 ). McCuUaghl (|l987 ). A more accurate list of references can be found in 
Di Nardo et al. (2008b). 

The umbral techniques, investigated in lOi Nardo et al.l (|2008bl) . have allowed us 
to implement a single algo rithm for fc-statistics, m ultivariate fc-statistics, polykays 



and multivariate polykays (jDi Nardo et al.l . l2008a[ ). Nevertheless, the elegance of a 



unifying outlook pays a price in computationa l costs that become comparable with 
those of MathStatica faose and Smithl . l2002f ) for polykays and not competitive for 
univariate and multivariate fc-statistics. 

By developing the ideas introduced in lPi Nardo et al.l (|2008b[ l for fc-statistics, in 
this paper we introduce radically innovative procedures for generating all these es- 
timators, by realizing a substantial improvement of computational times compared 
with those in the literature. 

A frequently asked question is: why are these calculations relevant? Usually 
higher order objects require enormous amounts of data to estimate with any accu- 
racy. Nevertheless, there are different areas, such as astronomy, astrophysics and 
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biophysics, which ne ed to compute high o rder fc-statistics in order to recognizing a 



gaussian population (JFerreira et al.l . ll997t ). Undoubtedly, an enjoyable challenge is 



to have efficient procedures to deal with the involved huge amount of algebraic and 
symbolic computations. 

The algorithm s prop osed here are based on the umbral language introduced by 
iRpta and Tavlor 11 9941 ) . Ap plications of the classical umbral calculus are given 



in Zeilberger (200Q) -^ (J2002l ). where generating functions are computed for many 
difficult problems dealing with counting combinatorial objects. Applications to 
bi linear generating funct ions for polynomial sequences are given in Gessel (2003). 



Di Nardo and Senatq (2001 ) have developed the classical umbral calculus (1994) . 



with special care to probabilistic aspects. The basic device is the representation of 
a unital sequence of numbers or polynomials by a symbol a, called an umbra. The 
umbra a is related to these unital sequences via an operator E that resembles the 
expectation operator of random variables (r.v.'s). This symb olic method provides a 



ligh t syntax for handling cumulants and factorial moments (jDi Nardo and Senatol . 
|2006aj), fc-statistics then come in hand since these a r e the u nique symmetric unbi- 
ased estimators of cumulants ( Di Nardo and Senatol . 12006 a ). These estimators are 



expressed in terms of power sums in the variables of the random sample. 

After recalling basic notions of the umbral language, in Section 3 we show that 
any cumulant can be evaluated via cumulants of a suitable umbra. In probabilistic 
terms, cumulants of a r.v. can be obtained via cumulants of a suitable compound 
Poisson r.v. This link allows us the significant speed up of the algorithms for 
fc-statistics and polykays. For multivariate cumulants, the basic tool is given by 
umbrae indexed by multisets . In Section 4, we rec all this symbolic device, in- 
troduced and largely used in iDi Nardo et al.l ( 2008b[ ). In Section 5, we show the 



connection between multivariate cumulants of an umbra and a suitable multivariate 
compound Poisson r.v. We then summarize the algorithms for generating multi- 
variate fc-statistics and polykays. Finally, we compare the computational times o f 
the algorithr ns proposed here with those of MathStatica teose and Smithl . l2002f ) 



and those of [Andrews and Stafford! ( 20001 ). Note that MathStatica does not have 



a procedure for multivariate polykays. All programs have been executed on a PC 
Pentium(R)4 Intel(R), CPU 2.08 Ghz, 512MB Ram with Maple version 10.0 and 
Mathematica version 4.2. We choose the Maple language due to its acknowledged 
plainness in translating symbolic computations. 

2. Background to umbral calculus 

This section is aimed to recalling notation and terminology useful to handle um- 
brae. More details and technicalities can be found in Di Nardo and Senato (200l|, 



2006aD 



Formally, an umbral calculus is a syntax consisting of the following data: 

i) a set A = {a, (3, . . .}, called the alphabet, whose elements are named umbrae; 
a) a commutative integral domain R whose quotient field is of characteristic 

zercQ; 
Hi) a linear functional E, called the evaluation, defined on the polynomial ring 
R[A\ and taking values in R such that 
a) E[l] = 1; 



For statistical applications, R is the field of real numbers. 
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b) E[a^(3^ • • • 7''] = i?[a*]i?[/3-'] • • • E[^''] for any set of distinct umbrae in 
A and for i,j,...,k nonnegative integers [uncorrelation property); 
iv) an element e e A, called the augmentation, such that _E[£"] = for every 

n > 1; 
v) an element m G A, called the unity umbra, such that _E[u"] — 1, for every 
n > 1. 

An umbral polynomial is a polynomial p e i?[74]. The support of p is the set of 
all umbrae occurring in p. If p and q are two umbral polynomials, p and q are 
uncorrelated if and only if their supports are disjoint. The umbral polynomials p 
and q are umhrally equivalent if and only if 

E[p\ — E[q\. in symbols p ~ q. 

The moments of an umbra a are the elements a„ G R such that 

£;[«"] =a„, Vn>0 

and we say that the umbra a represents the sequence of moments 1, ai, 02, . . . . 

It is possible that two distinct umbrae represent the same sequence of moments, 
in this case they are called similar umbrae. More formally, two umbrae a and 7 are 
said to be similar when 

E[a^] = £:[7"] Vn > 0, in symbols a = 7. 

In addition, given a sequence l,ai,a2, . . . in R there are infinitely many distinct 
and thus similar umbrae representing the sequence. 

The factorial moments of an umbra a are the elements «(„) g R corresponding 
to umbral polynomials (a)„ = a{a — 1) • • • (a — n + 1), for each n > 1 via the 
evaluation E, that is i5[(a)„] = «(„). 

Two more special umbrae have been defined in the alphabet A: the singleton 
umbra x and the Bell umbra (3. 

The singleton umbr a y is the umbra whose momen ts are all zero, except the first 
E[x\ = 1. As shown in lDi Nardo and Senatol ( 2006a ). its factorial moments are 



(2.1) i?[(x)„]=x(„) = (-l)"-i(7i-l)!, Vn>l. 

The Bell umbra j3 is the umbra whose factorial moments are all equal to 1, 

(2.2) £;[(/?)„]= 6(„) = 1, Vn>l. 

Its moments are the Bell numbers. The umbra is therefore the u mbral counterpart 
of a Poisson r.v. with parameter 1 ( Di Nardo and Senatol . I2OO 11 ). 



Thanks to the notion of similar umbrae, it is possible to extend the alphabet 
A with the so-called auxiliary umbrae, obtained via operations among similar um- 
brae. As a consequence, a saturated umbral calculus ca n be constructed where 
auxiliary umbrae are treated as elements of the alphabet ( Rota and Taylon . 1 19941 ) . 



Let {oil, a2, ■ ■ ■ , ctn} be a set of n uncorrelated umbrae, similar to an umbra a. The 
symbol n.a denotes an auxiliary umbra similar to the sum ai + a2 + ■ ■ ■ + ctn and 
called the dot product between the integer n and the um bra a. Powers oi n.a ar e 
umbrally equivalent to the following umbral polynomials ( Di Nardo et al.l . l2008b[) : 



(2.3) {n.ay ~ J2^n),^dxax, 



Xhi 
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where the sum is over all partitions □ A — (l''"i,2'"2, . . .) of the integer i, (n),j^ = 
for v\> n and 

(2-4) ^^ - ;:^ (I!)^7(^!)^7TT: --d a. ^ (a.J- (a^'^^ " ' ' - 

with {ji} distinct integers chosen in {1, 2, . . . , n\. By evaluating equivalence (|2.3[) 
via the linear functional -E, we have 

(2.5) E\{n.af\ = ^(n),,dAaA, 

Ahi 

where a\ = a^ a^ ■ ■ ■ . Note that if A = (f^ 2''^, . . .) is a partition of the integer 
r,^ = (l''i,2'*=,...) is a partition of the integer s and A + 77 = (fi+^S 2''2+^^ . . .), 
then 

flA+j) = flAflr, and QfA+r, - aAQ!??- 

Properties of the auxiliary umbra n.ct have been extensively described in' Di Nardo and Senate 



(|2001il and these will be recal led whenever it is nece s sary. I t is interesting to remark 



that a.n = na, as proved in iDi Nardo and Senatol (j2006af ). in agreement with the 
meaning of the dot product. 

A feature of the classical umbral calculus is the construction of new auxiliary 
umbrae by suitable symbolic replacements. For example, if we replace the integer 
n in n. a by an umbra 7, equivalence (j2.3p gives 

(2.6) i^.ay ^Y^^jW^xax. 

Ahi 

Equivalence (12. 6p has been formally prov ed by using the notion of gen erating func- 
tion of an umbra, for further details see ( Di Nardo and Senatol . 120011 ). Note that. 



contrary to what happens with n.a, in the dot product a.n the substitution of n 
with an umbra 7 does not inherit the symbolic expression of moments. As it is 
straightforward to show via (|2.6p . the dot product is therefore not commutative. 
This circumstance justifies also the falling off of the right distributive law in the 
dot product, so that 

(a + (5).7 = a.7 + (5-7 whereas ^.{a + S) ^ ^.a + "f.S, 

for a, 7, 5 G A. Actually, by considering the parallelism with the r.v.'s theory, the 
dot product j.a corresponds to a random sum, and the right distributive law falls 
off similarly to what it happens for random sums. 

In the dot product 7.0;, by replacing the umbra 7 by the umbra j.(3, we obtain 
the so-called composition umbra of a and 7, that is j.(3.a, whose powers are 

(2.7) (7./3.a)^~^7''-dAaA. 

Ahi 

The compositional inverse of an umbra a is the umbra a^^^^ satisfying 

a<-^>.(3.a = a.(3.a<-^> = x- 



Recall that a partition of an integer i is a sequence A = (Ai, A2, . . . , At), where Aj are weakly 
decreasing integers and J^^_i Aj = i. The integers Aj are named parts of A. By the symbol ux we 
denote the length of A, that is the number of its parts. A different notation is A = (fi , 2''2 ,...), 
where rj is the number of parts of A equal to j and ri+r2 + - ■ ■ = iy\. We use the classical notation 
A h i to denoting "A is a partition of i" . 
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In the following examples, some other fundamental auxiliary umbrae are character- 
iz ed by means of equivalence (|2. 6 |1 . The properties we are going to recall are proved 
in lDi Nardo and Senatol (|2006ai bh. 



Example 2.1. The a-partition umbra. If /? is the Bell umbra, the umbra p.a is 
called the a-partition umbra. By taking into account (|2.6p and (|2.2p . its powers 
are 

(2.8) (/3.a)^~^dAaA. 

By equivalence (|2.8p . we have 

(2.9) f3.u<-'>=x, P.X = u, 

where m^~^^ denotes the compositional inverse of u. The umbra p.a corresponds 
to a compound Poisson r.v. of parameter 1. 

Example 2.2. The a-cumulant umbra. If x is the singleton umbra, the umbra x.a 
is called the a-cumulant umbra. By virtue of equivalence (|2.6p , its powers are 

(2.10) {x.af ~^X(^^)dxax, 

Ahi 

where xiy^\ are the factorial moments (|2.ip of the umbra x- By taking into account 
(I2.10p . the following equivalences follow 

X-P = u, x-X = u^~^^- 

Equivalence ()2.10p recalls the well-known expression of cumulants ki,K2,... in 
terms of moments oi, 02, . . . of a r.v. 

(2.11) AC, = ^(-l)''^-i(;.A-l)!rfAaA. 

Ahi 

The moments of the a-cumulant umbra x.a are therefore called cumulants of the 
umbra a. 

Example 2.3. The a-factorial umbra. The umbra a.x is called the a-factorial um- 
bra and its moments are the factorial moments of a, since the following equivalence 
holds 

{a.xT ^ {a)^. 

The commutative integral domain R may be replaced by a polynomial ring in 
any number of indeterminates, having coefficient in a field K of characteristic zero. 
Suppose therefore to replace R with the polynomial ring K[y], where y is an inde- 
terminate. The uncorrelation property Hi) must be rewritten as 

E[l] = 1; E[y^a^l3^ • • •] = y^ E[a'']E[l3^] ■ ■ ■ 

for any set of distinct umbrae in A, and nonnegative integers j. A;, ^, .... In if [2/][A] 
an umbra is said to be a scalar umbra when its moments are elements of K, while it 
is said to be a polynomial umbra if its moments are polynomials of K[y]. A sequence 
of polynomials po,pi, . . . G K[y] is umbrally represented by a polynomial umbra 
if and only if po = 1 and p„ is of degree n for every nonnegative integer n. If we 
replace the integer n by the indeterminate y in (|2.5p . then 

(2.12) E[{y.ay] = Y.^y),,dxax, 

X\-i 
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where {y)ux denotes the lower factorial polynomials in K[y]. 

Some other auxiliary umbrae will be used in the following. The symbol a" is 
an auxiliary umbra denoting the product ai a2 • ■ ■ a„, where {ai, a2-, ■ ■ ■ ,an} are 
similar but uncorrelated umbrae. Moments of a" can be easily recovered from its 
definition. Indeed, if the umbra a represents the sequence 1, ai, 02, . . . , then 

for nonnegative integers k and n. The umbra 7 is said to be multiplicative inverse 
of the umbra a if and only if aj = u. Recall that, in dealing with a saturated 
umbral calculus, the multiplicative inverse of an umbra is not unique, but any 
two multiplicative inverses of the same umbra are similar. From the definition, it 
follows: 

o-ngn — 1 Vn =: 0, 1, 2, . . . that is (/„ = — , 

where a„ and 5„ are moments of a and 7 respectively. The multiplicative inverse 
of an umbra a should be denoted by a'^~^' , but in order to simplify the notation 
and in agreement with our intuition, in the following we will use the symbol 1/a. 

3. fc-STATISTICS VIA COMPOUND POISSON R.V.'S 

In this section we resume previous results of the authors ( Di Nardo et al.Ll2008bl ). 
useful to simplify the subsequent reading. 

The z-th fc-statistic ki is the unique s ymmetric unbiased esti mator of the cumu- 



lant Ki of a given statistical distribution (jStuart and Ordl . 119871 ). that is E[ki] = k. 



By virtue of (|2.1ip , in umbral terms we shall write 



In lDi Nardo et al.l (|2008br ). fc-statistics have been related to cumulants of compound 



Poisson r.v.'s by the following theorem. 

Theorem 3.1. If Ci{y) ~ E[{n.x-y-P-ay],i = 1,2, . . . then 

(3.1) ix.ay c, c. (^ 



The statement of Theorem l3.1l reauires some more remarks. As stated in lDi Nardo and Senato 



()2006al) . the umbra 

{X-y-(3).a = x-{y-P-a) 

is the cumulant umbra of a polynomial a-partition umbra, the latter corresponding 
to a compound Poisson r.v. of parameter y. The polynomial umbra 

n.{x-y-f3).a = n.x.{y-l3.a), 

is therefore the sum of n uncorrelated cumulant umbrae of a polynomial a-partition 
umbra. Thus, Theorem 13.11 states that cumulants of a can be recovered from the 
moments of n.(x.j/./3).a, by a suitable replacement of the indeterminate y. 

Usually fc-statistics are expressed in terms of the r-th powers of the data points 
Sr = X]i=i -^i- I^ order to recover this expression for fc-statistics in umbral terms, 
it is sufficient to express the polynomials Ci{y) in terms of power sums n.a^ = 
a^L -|- • • • +a^. To this aim, the starting point is to express the moments of a generic 
umbra such as n.{'-fa), with 7 6 A, in terms of r-th power sums n.a^. 
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Theorem 3.2. Ifa,-fiE A, then 

(3.2) [n.{^a)Y ^ ^ d^{x.jUn.aY' [n.a^^ • • • 

Ahi 

withX^ (l'■l,2'■^...). 

Equivalence (|3.2p is the way for expressing the polynomials Ci{y), umbrally equiv- 
alent to moments of n.x.y.p.a, in terms of r-th power sums n.a^ . Indeed, as 

n.x-y-P.a = n.[{x-y-(3)a] 

(see (31) in lOi Nardo and Senatd l2006a[ ) . we can use equivalence p.2p . with 7 



replaced by {x-y-f3)- This is the starting point to prove the following result, by 
which the fast algorithm for fc-statistics can be easily recovered. 

Theorem 3.3. In K[y], let 

n 

(3.3) pn{y) = ^(-l)'=-i(fe - 1)! S{n, k) y\ 

where S{n, k) are the Stirling numbers of second type. For every a G A we have 



(3.4) ^^,ayc^Y.d,pJ^){n.a 



-. n.Y 



with X = (l'■^2'■^ . . .) andpxiy) = [pi{y)Y'[P2{y)Y' ■■■ ■ 

For the proofs of Theorems [3?2l and [3^31 see (JDi Nardo et al.l . l2008bl ). 



3.1. Polykays via compound Poisson r.v.'s. The symmetric statistic fcr,...,t 
satisfying 

E[kr,...,t] ^ Kr- ■■ Kt, 

where Kr, . . . ,Kt are cumulants, genera l izes fc -statistics and these were originally 
called generalized k-statistics bv lDressell ( 19401 ). Later they were called polykays by 



IXukey (1950) 

As a product of uncorrelated cumulants, the umbral expression for a polykay is 
simply 

(3.5) kr_t^{x-ar---{x'-a')\ 

with Xt ■ ■ iX' being uncorrelated singleton umbrae, and a, . . . , a' uncorrelated um- 
brae satisfying a= ■ ■ ■ = a 



Also polykays are usually expressed in terms of power sums. In iDi Nardo et al 



(|2008bl ). starting from (|3.5p . we have given a compressed umbral formula in order to 



express polykays in terms of power sums. Such a formula has been implemented in 
Maple and the resultin g computational times have been presented and discussed in 



Di Nardo et al.l (J2008a[ ) . Despite the compressed expression for this umbral formula, 
the computational cost of the resulting algorithm involves the Bell numbers and so 
increases too rapidly with r + - ■ ■+t. A different umbral formula may be constructed 
by generalizing the results of the previous section. Such a formula is not quite 
expressible in a compressed form, but speeds up the algorithm for building polykays 
(see the computational times in Table 1 in Section 6). 

For plainness, in the following we just deal with two subindexes fc^,*, the gener- 
alization to more than two being straightforward. 
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Let US consider a polynomial umbra whose moments are all equal to y, up to an 
integer fc, after which the moments are all zero. Let us therefore define the umbra 
5y^k satisfying 

. ^f {x-y-Py i = o,i,2,...,fc, 
^y^^ -\ i>k. 

Lemma 3.4. Let r,t be two nonnegative integers. If k — max{r, i}, then 
(3.6) [n.{Sy^ka)Y+' ^ ^ 2/''^+'^''N..+.„rfA+,aA+,. 

(Al-r,77l-t) 

Proof. By equivalence (|2.5p . we have 



?H(r+t) 

since Sy^k and a are uncorrelated. The result follows by observing that (Sy^k)^ — 
for any ^ satisfying A + ?7 < ^, where A h r, 77 h i and < represents the lexicography 
order on integer partitions. D 

Let Pr.tiy) be the polynomial obtained by evaluating the right hand side of (|3.6p . 
that is 

The proof of the following theorem is straightforward and allows us to express 
products of uncorrelated cumulants by using the polynomials p^,* (y) ■ 

Theorem 3.5. If qr,t is the umbral polynomial obtained via pr^tiy) by replacing 

^ ■ ^ {n.xy^+'''^ dx+,Y 

then 

{x.aYix'-a'Y ^ qr,t- 

These two last results are sufficient to express the polykay kr^t in terms of power 
sums. The steps are summarized in the following: 

i) we apply Theorem l3.2l to the polynomial umbra n.{5y^k a), with k — max{r, t}, 
in order to link the polynomials Pr,t{y) to power sums, that is 

(3.8) [n.{5y^ua)Y+'^PrAy)^ E rf?(x-'5,.fc)«(n.a)^^ (n.a^)^^ • • • ; 

«l-(r+t) 

ii) we evaluate the cumulants of the umbra Sy^u by means of (j2.10p . by recalling 

that moments corresponding to powers greater than k are zero; 
Hi) we replace occurrences of y^'^+'^'j in p.Sp by p.7p . thanks to Theorem 13.51 

The steps i) - Hi) are the building blocks of the fast algorithm for generating 
polykays. 
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4. The multivariate case: umbrae indexed by multisets 

In order to consider multivariate fc-statistics and polykays, we need of the no- 
tion of multivariate moments and multivariate cumulants of an umbral monomial. 
The umbral tools nece ssary to deal with multivariate moments are introduced in 



Di Nardo et alJ (j2008b(). Here we recall basic notation and equivalences in order to 



generalize Theorems 13.21 and 13.51 to the multivariate case. 

A multiset M is a pair (M, /), where M is a set, called the support of the multiset, 
and / is a function from M to nonnegative integers. For each fi G M, f{fi) is the 
multiplicity of /i. The length of the multiset (M, /), usually denoted by |M|, is the 
sum of multiplicities of all elements of M, that is 

\M\ = J2 /(/^)- 

When the support of M is a finite set, say M — {/ii, /X2, . . . , ^ik}^ we will write 

]\'T — /..(/(a*!)) ,,(/(A'2)) ..(/(Mfc))! „-. ]\,r _ i ,, ,, ,, ,, \ 

M — {fj,^ ,fi2 ,... ,fi^ I or M — l /xi, . . . ,fii , . . . , ij,k,- ■ ■ jlJ-kJ - 

/(mi) /(Mfc) 

For example the multiset 

Af = {a,...,a} = {aW} 

i 

has length i, support M — {a} and /(a) — i- Recall that a multiset Mi = {Mi, fi) 
is a submultiset of M = {M, /) if M^ C M and /,(/Lt) < f{ii), V/i G Mi. 

Example 4.1. If M — {a, a, 7, (5, (5} then Mi — {a, a} is a submultiset with 
support A?i = {a} and fi{a) — 2. Also M2 ~ {a,S,S} is a submultiset with 
support A?2 — {ct,S} and /2(q:) — l,/2((5) = 2. 

In the following we set 

(4.1) 

m = n ^^^"^ ("•^)^^ = n (^•A')^^^^ [n.ixi^)]M = n [^•(x/^)]^^''^- 

For instance, if Af ~ {a^*^} then qm — a', in.a)M = {n.aY, [n.{xa)]M — ["-(XQ^)]'- 
A subdivision of a multiset Af is a multiset S = {S,g) of fc < \M\ non empty 
submultisets Mi = {Mi, fi) of M satisfying 
i) utiM, = M; 
««J ELi /«(m) = /(Ai) for any /x G Af . 

Example 4.2. Multisets ^i = {{a, 7}, {a}, {S, S}} and ^2 = {{a, 7, (5}, {a, S}} are 
subdivisions of Af = {a, a, 7, J, (5}. 

By extending the notation (|4.ip . we set 

(4.2) M5= n 4!^'^ i^■^')s^ n ("-mmj^^''-^ 

Mies MiGS 

(4.3) [«.(xm)]5= n [^-(xMmJ]^^*''^. 

Mies 
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Example 4.3. If M = {fii, fii, ^12}, then Si = {{^J■l},{^J'l, ^J■2}} is a subdivi- 
sion of M. The support of Si consists of two multisets, Mi — {fii} and M2 = 
{^1, ^2}, each of one with multiplicity 1, therefore [n.{x^^)]Sl = "■■(xMMi)"'-(x/^M2)- 
Since n.(xMMi) = n.{xfJ.i) and ^.(xMAfs) = n-ixfJ^im), we have [n.{xfi)]si = 
n.{xfii)n.{xfiiH2)- 

We may construct a subdivision of the multiset A'l by a suitable set partition. 
Recall that a partition tt of a set C is a collection tt = {Bi, B2, ■ ■ ■ -.Bk} with k < n 
disjoint and non-empty subsets of C whose union is C. We denote by n„ the set 
of all partitions of C. Suppose the elements of M to be all distinct, build a set 
partition and then replace each element in any block by the original one. By this 
way, any subdivision corresponds to a set partition tt and we will write 5^. Note 
that ISttI — |vr| and it could be 5*,^ = S^.^ for tti ^ 7r2, as the following example 
shows. 

Example 4.4. If A/ — {a, a, 7, S, 6} suppose to label each element of A/ in such a 
way C = {«!, a2,ji,Si,62}- The subdivision Si — {{a, 7}, {a}, {5,S}} corresponds 
to the partition tti = {{ai,7i}, {02}, {Si,S2}} of C. We have 15*11 = |7ri|. Note that 
the subdivision Si also corresponds to the partition 7r2 = {{a2,7i}, {ai}j {<5i, '^2}}- 

In the following, we denote by n-^ the number of set partitions in ni^/i corre- 
sponding to the same subdivision 5" of the multiset M. 
If M ~ {a*-'-*}, then subdivisions are of type 

(4.4) 5 = {M,...,M,{a(2)},...,{a(2)},...}, 

^ V "^ ^ V "^ 

n r2 

with ri -h 2 r2 H = i. The support oi S is S = {{a}, {a^"^^}, . . .}, so that (g^D 

and (|4.3p give 

(4.5) {n.a)s = {n.aY' {n.a^Y^ ■ ■ ■ [n.(xa)]s = [n.ixa)Y' [n.ixa^W' ■■■ ■ 
Before ending this summary, we recall one more notation. Suppose S* is a subdi- 
vision of the multiset M of type S = {A/|^^^^^^\m^^^*^"^\ . . . ,MJ^^*^^^^}. By the 
symbol fi we denote 

(4.6) m-^^(mmJ-^('^^'---(/.'m,)-''(^"'\ 

where ^Mt s-re uncorrelated umbral monomials. Observe that also /i is a multi- 
plicative function, that is if Si and S2 are subdivisions of M then 

where Si + S2 denotes the disjoint union of Si and ^2. If Af — {a*^*-'}, then 

(4.7) ax = a-^ 

with A — (fi, 2''^, . . .). The notation (14. 6p cames in handy in order to evaluate 
umbral polynomials like {n.fiJM in terms of moments of the umbral monomials 
running in M. Indeed, observe that a different way to write equivalence (|2.3|) follows 
by using subdivisions S'^r of M = {a^*-*}, that is 

(4.8) (n.ay c^ ^ (n.x)l^"la-^'. 
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Replace the multiset M — {a^'^} by a generic multiset M, then it follows 

•s,,|,,.s„ 

TTGHi 



(4.9) {n.^l)M ^ Yl ('^•X)''^'"A*- 



We end the section by adding one more remark. Let A^ be a submultiset of M. By 
the symbol (/iM)jv we denote the monomial umbra fJ-MnN- This notation allows us 
to generalize equivalence (|4.9p to umbral polynomial {n.iJ,M)N with iV C M, that 
is 

(4.10) {n.tiM)N^ Y. ('^•X)'^"'(mm)-^'. 

7ren|jv| 

5. Multivariate fc-STATiSTics via compound Poisson r.v.'s 



In lDi Nardo et al.l ( 2008bl ). multivariate moments and multivariate cumulants of 
an umbral monomial arc introduced. Let M = {^Y^'^'^',!!^, ■ ■ ■ -^lA} be 
a multiset of length i. A multivariate moment is the element of K\y\ corresponding 
to the umbral monomial /ia/ via evaluation E^ that is 



E[^im] = m-ti... 



where tj ~ fil^-j) for j — 1, 2, • • • , r. The corresponding multivariate cumulant is 
the element of K[y] satisfying 

(5.1) i?[(x.M)Af] = Kti...t.. 

For example, if M = {a'^*'} then {x-I^)m — ix-c^Y- As for /c-statistics, the umbra 
(x.y./?./x)M is the cornerstone for building efficiently multivariate /c-statistics. Let 
us observe that the evaluation of {x■y■P■^^)M gives the umbral counterpart of joint 
cumulants of a multivariate compound Poisson r.v. with parameter y. We need 
therefore to characterize the evaluation of [n.(x.y./3./.t)]M- 

Proposition 5.1. Let M be a multiset of length i. The umbra [n.(x-y-/3-M)]-A/ ** 
umbrally equivalent to the umbral polynomial 



(5.2) CM{y) = V (n.x)l^"lyl^-lM 



s.\„\s^\,,.s. 



where S^^ are the subdivisions of the multiset M corresponding to the partitions tt. 



Proof. In equivalence (|4.9p . replace the generic umbral monomial /i by x-V-P-l^- We 
have 

(5.3) [n.{x-y -P -11)^1 ^ E("-X)"'^'[(X-2^-/^)^]'^"' 

TreRi 

where the form on the right-hand side is worked out by means of the equivalence 
X-y-(3-fJ- = (x-?/-/9)m- The umbrae (x-y-P) and /i are uncorrelated, so that 

[(x.y./3)M]-^' = (x.y./3)-^>-^". 

Let S^ = {m[^^^'^'^\ M!f^^'^^\ ..., MJ^*^'^^'^^}. Equivalence jEl follows from (jO)) . 
by observing that 

(x.y./3)-^— yl^-l, 
since the umbra x-V-P has moments all equal to y, and '^g{Mi) = 15*^1. D 
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Theorem 5.2. If CM{y) are the umhral polynomials given in ^5.2\) . then 

(5.4) CM I ^ ) ^ (x-Ai)M- 

\n.xj 

Proof. The result follows directly from (j5.2p by replacing y by ^^^ and by recalling 
that 

(5.5) (x.//)m^ E(^-:^)"'"'a^-^"' 

an equivalence which follows from (j4.9p by replacing n by x- D 

Theorem 5.3. Lef Pnix) — [pi{x)Y^ [p2{x)Y^ • • • , with Pn{x) given in k3.!^) and it 
a partition o/n|jv/| with r\ blocks of cardinality 1, r2 blocks of cardinality 2, and so 
on, then 

(5.6) (x.m)m ^ Xl P^ ( ^ ) ("-m)s^- 

Proof. First observe that [n.(x.j/./3.^)]M = ['^.((x.y./3)A*)]M, so that 

CAiiy) ^ [n.{{x.y.p)fi)]M- 

We need to express [n.{{x.y.P)fJ-)]M in terms of power sums. To this aim, note that, 
by using (|4.5p and (|4.7p . equivalence (|3.2p can be rewritten as 

(5.7) [n.(7a)]M~ ^ (x.7)-^"(^.«)s., 

where Sj^ is a subdivision of the multiset M — {a'-*-'}. Replace M by any multiset. 
Equivalence (|5.7p becomes 

(5.8) K(7Ai)]Af- E(^-^)^^'("-^)s- 

In equivalence (|5.8[) . suppose to replace 7 by x-y-P, then 

(5.9) [n.((x.2;./3)M)]M ^ ^ (x-X-2/-/3)-^'(ri./^)s,. 

Let ^. = {M^''^^^", m(^('^^», . . . , MJ^(*'^»}, then 

(x.x.y./?)-^' ^ [{x-x-v-P)mA'^'''^ ■ ■ ■ [{x'-x'-y-P')M,V^''^l 

Observe that 

{x-x-y-f3)M^ = n ix-x-y-py^''^ = ix-x-y-P)^''''^, 

so that 

E[{x.x-y-f3y'-] = [PiMjyW^'''^ ■ ■ ■ biM,i(2/)]^(*'^) ^P.iy) 

if 5',r is the subdivision corresponding to the partition tt. By replacing y by (x.x)/("'-x)j 
equivalence (|5.6p is proved. D 

Recall that the multivariate fc-statistics are the unique symmetric unbiased es- 
timators of joint cumulants. Since these estimators are umbrally equivalent to 
(X-m)m, with a suitable choice of the multiset M, the expression for multivariate 
/c-statistics in terms of power sums is given by the right-hand side of equivalence 
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5.1. Multivariate polykays via compound Poisson r.v.'s. The symmetric 
statistic kti...t,.;...;ii...i^ satisfying 

where Ktx...t.r i ■ • ■ i Hh...im ^^^^ muhivariate cumulants, generahzes polykays. As prod- 
uct of uncorrelated multivariate cumulants, the umbral expression for a multivariate 
polykay is simply 

(5.10) K...u■,...■M...l^^{x■^i)T■■■{x'■^^')L, 

with X, . . . , x' being uncorrelated singleton umbrae and T, . . . , L multisets of umbral 
monomials such that 

Also for multivariate polykays we have given a compre ssed umbral formula in terms 



of multivariate power sums (jDi Nardo et al.l . l2008bf ). Such a formula has been 



implemented in Maple and the resulting computational times have been presented 



and discussed in iDi Nardo et al.l (|2008a[ ). Here we generalize the procedure given 
for univariate polykays, speeding up the algorithm. 

For plainness, in the following we just deal with two multisets T and L, the 
generalization being straightforward. 

Let N be the disjoint union of all submultisets respectively of T and L and 
suppose to denote by + the disjoint union of two multisets. 

Example 5.1. Let T ~ {/ii, 112} and L = {mi}. The disjoint union of all submul- 
tisets respectively of T and L is iV = {{/ii,/i2}, {/^i}: {^2}, {mi}}- H T — {^i, fi2} 
and L = {^3} then N = {{^1, ^2}, {^i}, {^^2}, {a^s}}- 

As before, we need of a polynomial umbra, indexed by a suitable multiset, which 
behaves as a filter on subdivisions of T + i, by deleting those which are not disjoint 
unions of subdivisions respectively of T and L. Suppose therefore Mi = (Mi,g), a 
submultiset of T + L. Let us define the umbra 5y^N satisfying 

(^^^^ (s \ ~/ ^ if M, ^ iV, 

^^ > IS,JVJM, - <y (^.y.^)j,,^^ ^ (x-y-/5)l*''l ^ y otherwise. 

Let Si, be a subdivision of T + i, and S-^ and Sr subdivisions respectively of T 
and L, without taking into account the distinct labels. Via (|5.1ip . the following 
equivalence results 

(5.12) (5,,^)-^—'^° if5. 5^5. + 5., 



(x.y./S)"^" ~ y''^"' otherwise, 

where < denotes the natural extension to subdivisions of the refinement relation 
defined on the lattice of set partitions. 

Lemma 5.4. If Sy_N is the umbra defined in i5.ll]) . then 

(5.13) [n.{Sy,^ ^)]t+, c. Y: (n.x)l^'l+l^^lyl^'l+l^^lA*-(^'-*-^^). 

(Tren|r|,ren|i|) 

Proof. From (I4.10p . we have 

(5.14) [n.{Sy^N t-^)]T+L ^ Y. {n.x)^'"^6y% f^-^-, 



u£n 



T + i 
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since Sy^N is uncorrelated with any element oi T + L. Due to ()5.12[) . in the sum on 
the right hand side of (I5.14p . the addends which give a non-zero contribution are 
only those corresponding to subdivisions which can be split in a subdivision of T 
and a subdivision of L, that is 



(5.15) [n.idy,N t^)]T+L ^ J2 ('^•X)'^''+'^^' ^,^ -5^^ M->-' 



s s 

V"-A; "y,Ar ^y,N P' " "■ 

(7ren|r|,ren|i|) 
Observing that S'^^ ~ yl'^'"' and S'^]^ ~ y''^^', the result follows immediately since 

By recalling that subdivisions corresponding to different partitions can be equal, 
equivalence ()5.13|) may be rewritten as 

(5.16) [n.{Sy^^ ^,)]r+, ^ ^ n.+. (n.x)l^-l+l^^l yl^'l+l^^l M-(^-+^^\ 

where nT^+T is the number of set partition pairs (tt, t) corresponding to subdivision 
Stt + St- Assume 

(5.17) pTAy)= E -.+.(-x)i"-i+i"^iyi^-i+i"^iM-^"-+"^). 

Thanks to (|5.16p . the next theorem is proved by simple calculations and allows us to 
express products of uncorrelated multivariate cumulants by using the polynomials 
PT,L{y)- 

Theorem 5.5. Suppose n^ (respectively tIt) the number of set partitions in '^\t\ 
(respectively 'R\l\) corresponding to the subdivision S-^ (respectively Sr) and n^r+r 
the number of set partitions in niT^^^i corresponding to the subdivision S^^ + Sr- If 
Qt.l is the umbral polynomial obtained from pt.l{v) by replacing y'^'^'^'^'^' with 

.g ^g^ (X-X)'^'^'(x'-XO'^-'n^»r 

then 

(x-m)t (x'-m')l '=^qT,L- 
Proof. Due to (|5.5p . product of multivariate cumulants may be written as: 

(x.m)t (x'.A^')l ^ E (X.X)I^"I(X'.X')I'^^I l^-^'-^'^l 

(iren|T|,Ten|i,|) 

The previous equivalence can be rewritten as 

(5.19) (x.Ai)T(x'y)L ^ E "-«-(X-x)I''"I(x'.x')"'^'m-^'''+''^'- 

The result follows by comparing the right hand side of (|5.19p with PT,L{y) in (|5.17p 
where yl'^^l+l-^^l has been replaced by (|5.18p . D 

Via equivalence (|5.8p . the following equivalence holds 

(5.20) [n.{5yMli)]T+L^ E iX-5y,N\^''{n.p)s^, 

by which it is possible to express multivariate polykays in terms of power sums. 
The algorithm is summarized in the following: 
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i) by equivalence (j5.20|) . we evaluate n.{5y^N iJ^) in terms of power sums in 

order to link the polynomials PL.jiy) to power sums; 
a) we evaluate the cumulants of the umbra <^j,,jv by means of 



{X.5y,N)M^ E(^-^)"''''^!f^ 



E 



which is an obvious generalization of equivalence (|5.5p : 
in) we replace occurrences oi y\^^\+\^^\ in {x-^y.N)M by (|5.18p . 



6. Computational comparisons 

Tables 1 and 2 show comparisons of computational times among four different 
software packages. The first one, which we call AS algo rithms, has been implemente d 
in Mathematica and refers to procedures explained in ( Andrews and Staffordll2000l ) . 
see http://fisher.utstat.toronto .edu/david/SCSI / chap. 3. nb. The second one refers 
to the package MathStatica ( Rose and Smithl . 120021 ). Note that in this pack- 
age, there are no procedures devoted to multivariate polykays. The third pack- 
age, named Fast algorithms, has been implemented in Maple 10.x by using the 
results of this paper. The last procedure, named Polyk, has been described in 
(jDi Nardo et al.ll2008a[ ). Let us remark that, for all the considered procedures, the 
results are in the same output form and have been performed by the authors on 
the same platform. To the best of our knowledge, there is no R implementation for 
fc-statistics and polykays. 



Table 1. Comparison of computational times in sec. for k- 
statistics and polykays. Missed computational times "means 
greater than 20 houres" . 



kt,...,i 


AS Algorithms 


MathStatica 


Fast-algorithms 


Polyk-algorithm 


fcs 


0.06 


0.01 


0.01 


0.08 


k7 


0.31 


0.02 


0.01 


0.03 


kg 


1.44 


0.04 


0.01 


0.16 


fell 


8.36 


0.14 


0.01 


0.23 


fci4 


396.39 


0.64 


0.02 


1.33 


fcl6 


57982.40 


2.03 


0.08 


4.25 


fcl8 


- 


6.90 


0.16 


13.70 


feo 


- 


25.15 


0.33 


42.26 


^22 


- 


81.70 


0.80 


172.59 


k2i 


- 


359.40 


1.62 


647.56 


fe26 


- 


1581.05 


2.51 


3906.19 


fc28 


- 


6505.45 


4.83 


21314.65 


k-i,2 


0.06 


0.02 


0.01 


0.02 


^4,4 


0.67 


0.06 


0.02 


0.06 


^5,3 


0.69 


0.08 


0.02 


0.07 


fc7,5 


34.23 


0.79 


0.11 


0.70 


^7,7 


435.67 


2.52 


0.26 


2.43 


^9,9 


- 


27.41 


2.26 


23.32 


fcio.s 


- 


30.24 


2.98 


25.06 


^4,4,4 


34.17 


0.64 


0.08 


0.77 
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The Polyk algorithm, introduced in lPi Nardo et ahl ( 2008ar ). has the advantage 
to give fc-statistics, muhivariate fc-statistics, polykays and muhivariate polykays, 
depending on input parameters. That is one algorithm for the whole matter. The 
computational times of Polyk are better than those of AS algorithms in all cases. 
Polyk works better than MathStatica for polykays but is not competitive for k- 
statistics. MathStatica has not a procedure for multivariate polykays. 



Table 2. Comparison of computational times in sec. for multi- 
variate /c-statistics and multivariate polykays. For AS Algorithms 
and Polyk-algorithm, missed computational times means "greater 
than 20 houres". For MathStatica, missed computational times 
means "procedures not available" . 



kt,...U;h...l^ 


AS Algorithms 


MathStatica 


Fast-algorithms 


Polyk-algorithm 


^3 2 


0.25 


0.03 


0.01 


0.03 


fc4 4 


28.36 


0.16 


0.02 


0.34 


^5 5 


259.16 


0.55 


0.06 


1.83 


^6 5 


959.67 


1.01 


0.16 


4.61 


kee 


- 


2.20 


0.28 


12.08 


^7 6 


- 


4.01 


0.53 


33.22 


fcyy 


- 


8.49 


1.04 


95.19 


he 


- 


7.37 


1.09 


91.80 


^8 7 


- 


14.92 


2.19 


300.60 


^3 3 3 


1180.03 


0.88 


0.47 


2.90 


^4 3 3 


- 


2.00 


0.40 


9.26 


^4 4 3 


- 


4.80 


0.94 


34.20 


^444 


- 


13.53 


2.30 


155.03 


fell; 11 


0.05 


- 


0.01 


0.01 


^2 1; 11 


0.20 


- 


0.01 


0.03 


fc2 2;ll 


1.22 


- 


0.03 


0.05 


fc2 2;2 1 


6.30 


- 


0.08 


0.09 


fc2 2;2 2 


33.75 


- 


0.14 


0.30 


fc2 1;2 1;2 1 


78.94 


- 


0.22 


0.45 


fc22;ll;ll 


30.01 


- 


0.14 


0.20 


fc22;21;ll 


126.19 


- 


0.28 


0.55 


fc22;21;21 


398.42 


- 


0.55 


1.66 


^2 2; 2 2; 11 


464.45 


- 


0.61 


1.59 


^2 2; 2 2; 2 1 


1387.00 


- 


1.25 


5.52 


^2 2; 2 2; 2 2 


3787.41 


- 


2.91 


20.75 



Finally, from Table 1 and 2, it is evident that there is a significant improve- 
ment of computational times realized by the Fast-algorithms, compared to the 
other three packages. The Fast-algorithms are available at the following web page 
http://www.unibas.it/utenti/dinardo/fast.pdf. 

In Table 3, we quote computational times for the fc-statistics, the polykays 
and the multivariate ones given in Tables 1 and 2, obtained with forthcoming 
MathStatica release 2, by using Mathematica 6.0, on Mac OS X, with Mac Pro 
2.8GHz (Colin Rose, private communication) 
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Table 3. Computational times in sec, for the /c-statistics, the 
polykays and the multivariate ones given in Tables 1 and 2, ob- 
tained with forthcoming MathStatica release 2. 



kt,....i 


MathStatica 2 


kti...tr;ll...lrr. 


MathStatica 2 


h, 


0.008 


k3 2 


0.012 


k7 


0.017 


k4 4 


0.009 


kg 


0.039 


ksb 


0.345 


fell 


0.084 


ke>5 


0.592 


fci4 


0.329 


k&a 


1.230 


fcl6 


0.917 


k7 6 


2.107 


fcl8 


2.804 


k7 7 


4.215 


^20 


9.363 


^8 6 


3.595 


fc22 


32.11 


^8 7 


7.359 


fc3.2 


0.012 


^3 3 3 


0.529 


kiA 


0.044 


^4 3 3 


2.552 


k7,5 


0.434 


^4 4 4 


6.926 


k7,7 


1.288 


fcll;ll 


0.006 


k<),9 


11.89 


^2 1; 11 


0.014 


kio,8 


12.39 


A;2 2;ll 


0.038 


^4,4,4 


0.359 


fc2 2;2 1 


0.085 






fc2 2;2 2 


0.020 






fc2 1;2 1; 2 1 


0.227 






^2 2; 1 1; 1 1 


0.154 






fe2 2;2 1; 1 1 


0.413 






fc2 2;2 1; 2 1 


0.928 






fc2 2;2 2; 1 1 


1.063 






^2 2; 2 2; 2 1 


2.622 






^2 2; 2 2; 2 2 


6.402 
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